RSA Explanation


If you have not tried the algorithm, it is suggested to test different prime pairs
and texts, and observe the encrypted and decrypted values.

RSA Algorithm

The RSA encryption algorithm is named after its founders, Rivest, Shamir, and Adleman. It relies on the principle that multiplying 2 primes is computationally simple but finding the prime factors of the product is difficult as no efficient classical algorithm is known.

Let p and q be distinct primes. Let N = pq. To verify is a number is prime, please use the prime check form on the right hand side of the Algorithm Demo page.

It is recommended to choose p and q such that N is over 122 to ensure two-way encryption of each English letter using the Unicode character set. However, for the sake for demonstration, let p = 5 and q = 2, such that N = 10.


The function Φ(N) must then be used, defined as the number of coprimes between 1 and N. The term coprime is a relationship between 2 numbers where they have no common factors aside from 1. For small values of N, e.g 10, it may appear convenient to manually count each coprime.

From 10, non-coprimes can be removed sequentially. This includes

  • 2
  • 4
  • 5
  • 6
  • 8
  • 10

Although this process is short for small N, the algorithm requires a sufficiently large N for greater security.

Therefore, a formulaic approach for counting coprimes with large N will be more appropriate, by either counting coprimes or removing non-coprimes.
By observing the list of removed co-primes, they follow a clear pattern; They are either multiples of p, q, or both. For example

  • 2
  • 4
  • 6
  • 8
  • 10

are all multiples of p, whereas 5 and 10 are multiples of q.
Therefore, to find Φ(N), we can subtract the multiples of p, subtract the multiples of q, and add back multiples of both p and q to ensure that 10, for example, is not counted twice.

And so: Φ(N) = N - (N/p + N/q - N/pq). Alternatively, if we consider non-coprimes of N as sharing a factor with either p or q, the numbers p, 2p, ... (q-1)p, qp and q, 2q, ... (p-1)q, pq are not corpime with N.



Therefore there are (p + q - 1) non-coprimes, as, similarly to the alternative method, pq is counted twice. The number of coprimes is thus N - (p + q - 1) = pq - p - q + 1, which factors into (p - 1)(q - 1), providing a simpler expression for Φ(N).

This expression is known as Euler's Totient function of N, when N is a product of 2 distinct primes.


Encryption and Decryption Key Generation

The encryption and decryption keys must now be generated such that they are unique and cannot be realistically discovered by attackers.

Firstly, Φ(N) must be calculated. Using the example, N = 10, Φ(N) = Φ(10) = 10 - (10/2 + 10/5 - 10/10) = 10 - (5 + 2 - 1) = 10 - 6 = 4. So Φ(10) = 4

To calculate the encryption key, let e be equal to the value of the encryption key. It must satisfy 2 conditions:

  1. 1 < e < Φ(N)
  2. e must be coprime with Φ(N)

Using the first condition against N = 10, only 2 and 3 satisfy this condition. And, when returning to the list of non-comprimes, 2 is removed, as it shares a common factor of 2 with Φ(N). Therefore, the only number satisfying both conditions is 3, as it has no common factors with Φ(N), so e = 3.


Let d = value of the decryption key. The decryption key must be generated by satisfying the rule de mod(Φ(N)) = 1, or alternatively, de ≡ 1 mod(Φ(N)). Substituting e = 3 and Φ(N) = 4 means 3d ≡ 1 mod(4)

As de ≡ 1 mod(Φ(N)), de = kΦ(N) + 1, and d = (kΦ(N)+1)/e, meaning d = (4k + 1)/3, and when testing small values of k, k follows a pattern;
k = 2, d = (8 + 1)/3 = 9/3 = 3, and when k = 5, d = (21/3) = 7.

It can be observed that k is in sequence

  • 2
  • 5
  • 8...

meaning that k = 3n - 1, so
d = (4(3n - 1) + 1)/3, and
d = (12n - 3)/3 meaning d = 4n - 1.

Therefore d can take multiple values, including

  • 3
  • 7
  • 11...

Any value in sequence 4n - 1 could be chosen, although to demonstrate that the RSA algorithm works for more than 1 decryption value, d = 7 will be chosen.


The keys have been calculated, with
e = 3 and d = 7. They can now be used for security purposes. In our simplified example, let the alphanumeric value of h = 8. When the encrypt button is pressed on the Algorithm Demo page, the following calculation is run:

(h^e)mod(N)
By substitution, 8^3(mod(10)) = 512(mod(10)) = 2, meaning the value of the ciphertext is 2, which can be represented with the variable b.


When the decrypt button is
pressed on the Algorithm Demo,
the following calculation is run using the value of b:

(b^d)mod(N) So 2^7(mod(10)) = 128(mod(10)) = 8, meaning the decrypted value is 8, which exactly matches
the initial plaintext when converted back to alphanumeric characters, returning "h".

h is returned. This works on arbitrarily long strings in this simultation
(although real RSA is primarily used to encrypt small pieces of data), so
the encrypted message can be arbitrarily large.

Citations

  1. Rivest, R. L., Shamir, A., & Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2), 120-126.
  2. Fagin, B. (2018). Composite Numbers That Give Valid RSA Key Pairs For Any Coprime p. Information, 9(9), 216.
  3. Jonsson, J., & Kaliski, B. (2003). Public-key cryptography standards (PKCS)# 1: RSA cryptography specifications version 2.1 (No. rfc3447).
  4. Rosendo, F. (1998) RSA: Public-Key Algorithms, i.web Labs.
  5. Mollin, R. A. (2002). RSA and public-key cryptography. Chapman and Hall/CRC.

Notes

We hope you enjoyed this explanation of the RSA encryption algorithm. Whilst the explanation is mathematically valid, there are certain points which should be addressed:

The real RSA algorithm uses extremely large values of N to ensure absolute security, whereas this website may be limited in regards to the maximum value of N due to website performance concerns. Additionally, this website uses other algorithms, such as modular exponentiation which were omitted in this explanation.

It is also recommended to test the algorithm to see how different messages are encrypted, and also to observe what happens when non-primes are entered.